[CK] Fix FMHA sink dispatch when init_sink_value is set #7530
Merged
`traits.has_sink` only checked `mask.sink > 0`, missing the case where `init_sink_value` is set via `-init_sink=1`. As a result, the GPU kernel was dispatched without sink support, which caused FMHA test failures.
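The change to the predicate can be sketched as below. This is a minimal illustration, not the code from `fmha_fwd_runner.hpp`: the `MaskInfo` and `RunArgs` structs, the `has_sink_before` / `has_sink_after` helper names, and the `int` field types are assumptions made for the example. Only the two conditions (`mask.sink > 0` and `init_sink_value != 0`) come from the PR description.

```cpp
// Hypothetical stand-ins for the runner's argument types; the real
// definitions in fmha_fwd_runner.hpp are not shown on this page.
struct MaskInfo {
    int sink;  // sink setting carried by the mask (type assumed)
};

struct RunArgs {
    MaskInfo mask;
    int init_sink_value;  // non-zero when -init_sink=1 is passed (type assumed)
};

// Before the fix: only the mask is consulted.
constexpr bool has_sink_before(const RunArgs& args) {
    return args.mask.sink > 0;
}

// After the fix: a non-zero init_sink_value also selects the sink path.
constexpr bool has_sink_after(const RunArgs& args) {
    return args.mask.sink > 0 || args.init_sink_value != 0;
}

// The case described in the PR: no mask sink, but -init_sink=1.
constexpr RunArgs init_only{{0}, 1};
static_assert(!has_sink_before(init_only), "old check misses init_sink_value");
static_assert(has_sink_after(init_only), "new check enables sink dispatch");
```

With the old predicate, the `init_only` configuration would be routed to a kernel without sink support. With the new one, it selects the sink-enabled path, and a configuration with neither setting still evaluates to `false`.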
`run_sink_mask_tests` (StreamLLM) and `run_sink_init_tests` (GPT-OSS) require kernel instances compiled with sink=true. The default `BUILD_TESTING` filter excludes these kernels, causing "not supported yet" failures. Gate them behind the `-m` and `-g` flags respectively, consistent with the existing `-s` (splitkv) and `-a` (appendkv) pattern.
asleepzzz approved these changes on May 18, 2026
assistant-librarian Bot pushed a commit to ROCm/composable_kernel that referenced this pull request on May 18, 2026
[CK] Fix FMHA sink dispatch when init_sink_value is set (#7530)

## Summary

- Fix `traits.has_sink` in `fmha_fwd_runner.hpp` to also check `init_sink_value != 0`, so the GPU kernel dispatches with sink support when `-init_sink=1` is passed.
- Gate `run_sink_mask_tests` (StreamLLM) and `run_sink_init_tests` (GPT-OSS) behind opt-in flags `-m` and `-g` in `smoke_test_fwd.sh`. These tests require sink=true kernel instances which are excluded by the `BUILD_TESTING` CMake filter (`*_nsink*`), causing unconditional "not supported yet" failures (48 tests in CI). The opt-in flag approach was borrowed from PR #6057.

## Why gate tests instead of compiling sink=true kernels?

The `BUILD_TESTING` filter in `CMakeLists.txt` uses `*_nsink*` glob patterns for the `fwd` and `fwd_splitkv` APIs, excluding sink=true kernel instances from compilation. We chose opt-in flags over widening the filter because:

- **Compile time**: Enabling sink=true kernels doubles the kernel variants for `fwd` and `fwd_splitkv` APIs. The filter exists specifically to reduce CI build times.
- **Incremental enablement**: Sink support (StreamLLM / GPT-OSS) is still maturing. Gating lets teams opt in explicitly (`smoke_test_fwd.sh -g`) while keeping the default CI path fast.
- **Precedent**: splitkv (`-s`) and appendkv (`-a`) tests already follow this opt-in pattern.

## Test plan

- [ ] Run `smoke_test_fwd.sh -g` with sink=true kernels compiled and verify sink-enabled kernels are dispatched
- [ ] Verify `smoke_test_fwd.sh` still passes without `-m` / `-g` flags
- [ ] Confirm CI no longer fails on sink tests (they are now opt-in)
aledudek pushed a commit that referenced this pull request on May 20, 2026
## Summary

- Fix `traits.has_sink` in `fmha_fwd_runner.hpp` to also check `init_sink_value != 0`, so the GPU kernel dispatches with sink support when `-init_sink=1` is passed.
- Gate `run_sink_mask_tests` (StreamLLM) and `run_sink_init_tests` (GPT-OSS) behind opt-in flags `-m` and `-g` in `smoke_test_fwd.sh`. These tests require sink=true kernel instances which are excluded by the `BUILD_TESTING` CMake filter (`*_nsink*`), causing unconditional "not supported yet" failures (48 tests in CI). The opt-in flag approach was borrowed from PR #6057 ([CK Tile][FMHA] Decouple StreamingLLM and GPT-OSS sink into independent compile-time modes).

## Why gate tests instead of compiling sink=true kernels?

The `BUILD_TESTING` filter in `CMakeLists.txt` uses `*_nsink*` glob patterns for the `fwd` and `fwd_splitkv` APIs, excluding sink=true kernel instances from compilation. We chose opt-in flags over widening the filter because:

- **Compile time**: Enabling sink=true kernels doubles the kernel variants for `fwd` and `fwd_splitkv` APIs. The filter exists specifically to reduce CI build times.
- **Incremental enablement**: Sink support (StreamLLM / GPT-OSS) is still maturing. Gating lets teams opt in explicitly (`smoke_test_fwd.sh -g`) while keeping the default CI path fast.
- **Precedent**: splitkv (`-s`) and appendkv (`-a`) tests already follow this opt-in pattern.

## Test plan

- [ ] Run `smoke_test_fwd.sh -g` with sink=true kernels compiled and verify sink-enabled kernels are dispatched
- [ ] Verify `smoke_test_fwd.sh` still passes without `-m` / `-g` flags
- [ ] Confirm CI no longer fails on sink tests (they are now opt-in)
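To make the effect of the `*_nsink*` filter concrete, here is a small C++17 sketch of how such a pattern partitions kernel instances. It rests on two assumptions that the PR does not spell out: that the filter keeps only names matching the pattern, and that sink=true instances carry `_sink` where the others carry `_nsink`. The instance names and the helpers `matches_nsink_filter` and `count_kept` are invented for illustration; the real filter lives in CMake, not in C++.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// The glob "*_nsink*" matches exactly the names that contain "_nsink",
// so a substring search is an equivalent check.
constexpr bool matches_nsink_filter(std::string_view instance_name) {
    return instance_name.find("_nsink") != std::string_view::npos;
}

// Hypothetical instance names, invented for illustration only.
constexpr std::array<std::string_view, 4> all_instances{
    "fwd_fp16_nsink",
    "fwd_fp16_sink",
    "fwd_splitkv_fp16_nsink",
    "fwd_splitkv_fp16_sink",
};

// Count how many instances survive the filter.
constexpr std::size_t count_kept() {
    std::size_t kept = 0;
    for (std::string_view name : all_instances) {
        if (matches_nsink_filter(name)) {
            ++kept;
        }
    }
    return kept;
}

// Half of the hypothetical instances are filtered out, so no sink=true
// kernel is available to the tests unless the filter is widened.
static_assert(count_kept() == 2, "only the _nsink variants are kept");
```

Under these assumptions, every sink=true variant is dropped at build time. This is consistent with the "not supported yet" failures the default test run hit before the sink tests were gated behind `-m` and `-g`.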